Bootstrapping Lexicalized Models in Memory-Based Dependency Parsing

ثبت نشده
چکیده

Previous research has shown that a lexicalized parsing model incorporating words but no parts-of-speech can outperform a model involving partsof-speech but no words given enough training data for supervised learning. We show that the same effect can be achieved with a bootstrapping approach, where a mixed model trained on a small treebank is used to parse a larger corpus which is used as training data for the lexicalized model. The results are obtained using a memory-based learning algorithm for deterministic dependency parsing of unrestricted Swedish text.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a way of syntactic parsing and a natural language that automatically analyzes the dependency structure of sentences, and the input for each sentence creates a dependency graph. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...

متن کامل

Parsing with Lexicalized Probabilistic Recursive Transition Networks

We present a formalization of lexicalized Recursive Transition Networks which we call Automaton-Based Generative Dependency Grammar (gdg). We show how to extract a gdg from a syntactically annotated corpus, present a chart parser for gdg, and discuss different probabilistic models which are directly implemented in the finite automata and do not affect the parser.

متن کامل

The Institute For Research In Cognitive Science Disambiguation of Super Parts

In a lexicalized grammar formalism such as Lexicalized Tree-Adjoining Grammar (LTAG), each lexical item is associated with at least one elementary structure (supertag) that localizes syntactic and semantic dependencies. Thus a parser for a lexicalized grammar must search a large set of supertags to choose the right ones to combine for the parse of the sentence. We present techniques for disambi...

متن کامل

Disambiguation of Super Parts of Speech ( or Supertags ) : Almost

In a lexicalized grammar formalism such as Lexicalized Tree-Adjoining Grammar (LTAG), each lexical item is associated with at least one elementary structure (supertag) that localizes syntactic and semantic dependencies. Thus a parser for a lexicalized grammar must search a large set of supertags to choose the right ones to combine for the parse of the sentence. We present techniques for disambi...

متن کامل

Learning Head-modifier Pairs to Improve Lexicalized Dependency Parsing on a Chinese Treebank

Due to the data sparseness problem, the lexical information from a treebank for a lexicalized parser could be insufficient. This paper proposes an approach to learn head-modifier pairs from a raw corpus, and to integrate them into a lexicalized dependency parser to parse a Chinese Treebank. Experimental results show that this approach not only enlarged the coverage of bi-lexical dependency, but...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004